Conversation

@elfkuzco
Collaborator

@elfkuzco elfkuzco commented Oct 20, 2025

This PR represents a fundamental change in the operation of the WP1 development server. The entire Flask app (referred to generally as "web", the server running at localhost:5000) is pushed into the docker compose graph. It is joined there by instances of all the images and services needed by the Zimfarm. There are several associated configuration and logic changes as well.

Taken together, this means that a WP1 developer will be able to run the selection -> ZIM process end to end, without requiring any credentials on a live Zimfarm.

Additional Rationale

As part of a cleanup in the zimfarm API (openzim/zimfarm#1391), requests to create recipes/tasks now require an offliner definition version. This PR reads the offliner definition version from an environment variable and sets up the zimfarm containers in a docker-compose graph. Previously, the API used "initial" as the definition version, but as scrapers evolve and their arguments change, the definitions change too.

Changes

  • use mwoffliner definition version from env (default to image tag)
  • set up a compose graph that includes the Zimfarm containers. These are created under two profiles, zimfarm and zimfarm-worker: the former starts only the API and UI, while the latter also starts the worker and receiver.
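
The "version from env, default to image tag" rule in the first bullet could look roughly like the sketch below. This is not the code from the PR: the environment variable name and the helper name are hypothetical, and the image reference in the usage note is only an illustration.

```python
import os

# Hypothetical variable name; the PR does not state the real one here.
ENV_VAR = "MWOFFLINER_DEFINITION_VERSION"


def definition_version(image: str, environ=os.environ) -> str:
    """Return the offliner definition version to send to the Zimfarm API.

    An explicit environment override wins; otherwise the tag of the
    image reference is used.
    """
    override = environ.get(ENV_VAR, "").strip()
    if override:
        return override
    # Drop any digest ("@sha256:...") and look only at the last path
    # segment, so a registry port such as "localhost:5000/..." is not
    # mistaken for a tag.
    last_segment = image.split("@", 1)[0].rsplit("/", 1)[-1]
    if ":" not in last_segment:
        raise ValueError(f"image {image!r} has no tag to fall back on")
    return last_segment.rsplit(":", 1)[1]
```

With no override set, `definition_version("ghcr.io/openzim/mwoffliner:1.17.2", {})` yields the tag `1.17.2`; setting the variable to `dev` makes the function return `dev` regardless of the image.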

@elfkuzco elfkuzco force-pushed the pass-offliner-definition-version branch from 62d06b6 to de7afb9 Compare October 20, 2025 14:58
@elfkuzco elfkuzco force-pushed the pass-offliner-definition-version branch from 8176974 to fc088ff Compare October 20, 2025 18:36
@codecov

codecov bot commented Oct 20, 2025

Codecov Report

❌ Patch coverage is 80.64516% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.75%. Comparing base (63b7a74) to head (2a5b7f2).
⚠️ Report is 25 commits behind head on main.

Files with missing lines | Patch % | Lines
wp1/logic/builder.py     | 83.87%  | 5 Missing ⚠️
wp1/web/builders.py      | 37.50%  | 5 Missing ⚠️
wp1/zimfarm.py           | 88.88%  | 2 Missing ⚠️

❌ Your patch status has failed because the patch coverage (80.64%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1027      +/-   ##
==========================================
- Coverage   92.90%   92.75%   -0.16%     
==========================================
  Files          73       73              
  Lines        4229     4249      +20     
==========================================
+ Hits         3929     3941      +12     
- Misses        300      308       +8     


Member

@audiodude audiodude left a comment

Not sure about this approach.

Member

@audiodude audiodude left a comment

Can we also update the name of this PR to something like "Integrate Zimfarm dev setup"?

@audiodude audiodude changed the title pass offliner definition version while creating tasks Integrate zimfarm dev setup Oct 27, 2025
@elfkuzco elfkuzco requested a review from benoit74 October 28, 2025 00:07
Contributor

@benoit74 benoit74 left a comment

See comments.

I also feel like nobody has run this end to end, since the zimfarm worker resources were not adequate, or am I missing something?

We should really run this end to end to ensure this setup works correctly.

By end-to-end, I mean (and propose) the following scenario:

  • user logs in to the WP1 UI
  • user creates a simple selection (one or two WPEN articles for instance, this detail is not important)
  • user requests this selection to be ZIMed
  • ZIM file is correctly created
  • WP1 UI displays ZIM location
  • user can download the ZIM

I totally understand that @elfkuzco might need help from @audiodude and myself regarding details on how to run this end-to-end test, but we should not close this issue / PR until we are sure that everything works from end-to-end. Otherwise this is mostly just a waste.

@audiodude
Member

> We should really run this from end-to-end to ensure this setup works correctly.

Yes I can definitely help with that. I'll patch this PR and try setting up/running the zimfarm locally and confirm that I can create and download ZIMs.

@elfkuzco
Collaborator Author

Updated the files with the recent changes:

  • added separate buckets for artifacts, logs and zims
  • updated the README to detail the worker resources and reason for the offliner definition
  • updated worker resources to 3 CPU, 20G RAM, 20G disk

@elfkuzco elfkuzco requested a review from benoit74 October 30, 2025 05:41
@benoit74
Contributor

Code LGTM. Waiting for the e2e test from @audiodude (if I understand correctly) before giving my formal approval.

@audiodude
Member

audiodude commented Oct 30, 2025

I made some minor tweaks to the PR, but it's still not working. My Zimfarm is still reporting the following for requests to http://localhost:8004/v2/schedules:

{"success":false,"message":"Offliner definition for offliner mwoffliner with version 1.17.2 does not exist"}

EDIT: This is after following the directions in the README and updating my local credentials.py

@benoit74
Contributor

Hmm, this is indeed a problem. To unblock you, please set 'definition_version': 'dev' in your local credentials.py; that should do the trick.

However, this is not the proper way to resolve the situation so that this PR can be merged. New offliner definitions will keep arriving, and all of them should be stored in the local Zimfarm DB so that developers can use almost any mwoffliner version / definition version. I feel like docker/zimfarm/create_offliners.sh should fetch all existing definitions from api.farm.openzim.org and populate the ones missing from the local dev DB. The documentation would then state that developers should rerun this script regularly to fetch new offliner definitions if they want to use them in their credentials.py.
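
The core of the sync step proposed above is a comparison between the versions known upstream and those already in the local dev DB. A minimal sketch of that comparison follows, assuming both sides can be reduced to plain lists of version strings; how those lists are fetched and how the missing definitions are written back is deliberately left out, since the API response shape is not shown in this thread. The function name is hypothetical.

```python
def missing_versions(upstream: list[str], local: list[str]) -> list[str]:
    """Return versions present upstream but absent locally.

    Upstream order is preserved and duplicates are dropped, so the
    result can be fed directly to whatever creates the definitions.
    """
    known = set(local)
    missing: list[str] = []
    for version in upstream:
        if version not in known:
            known.add(version)  # also guards against upstream duplicates
            missing.append(version)
    return missing
```

For example, with upstream versions `["initial", "1.17.2", "1.17.3"]` and a local DB holding `["initial", "dev"]`, only `1.17.2` and `1.17.3` would need to be created locally; the local-only `dev` definition is left alone.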

@audiodude
Member

@elfkuzco Looking at this again now. I tried to update some of the credentials*example files per @benoit74's suggestion.

I'm still getting failures in my Zimfarm:

[error] [2025-11-22T00:15:46.075Z] Failed to read articleList from [http://wp1bot-web-dev:5000/v1/builders/80a9cf73-0174-4740-a645-a8768cac6534/selection/latest.tsv] Error: Failed to read articleList from URL: http://wp1bot-web-dev:5000/v1/builders/80a9cf73-0174-4740-a645-a8768cac6534/selection/latest.tsv

I have this in my credentials.py.dev:

        'CLIENT_URL': {
            ...
            "backend": "http://wp1bot-web-dev:5000",
        },

I also tried it with the following:

        'CLIENT_URL': {
            ...
            "backend": "http://dev-web:5000",
        },

And still got:

[error] [2025-11-22T00:20:47.439Z] Failed to read articleList from [http://dev-web:5000/v1/builders/823b0918-4f57-44e3-86c4-0812501cc75c/selection/latest.tsv] Error: Failed to read articleList from URL: http://dev-web:5000/v1/builders/823b0918-4f57-44e3-86c4-0812501cc75c/selection/latest.tsv

As a reminder, this is the stanza in docker-compose-dev.yml:

  dev-web:
    build:
      context: .
      dockerfile: docker/web/Dockerfile
    container_name: wp1bot-web-dev
    environment:
      - FLASK_DEBUG=1
      - FLASK_RUN_HOST=0.0.0.0
    command: flask --app wp1.web.app run
    networks:
      - wp1bot-dev
    ports:
      - 5000:5000
    volumes:
      - ./wp1/credentials.py.dev:/usr/src/app/wp1/credentials.py
    links:
      - redis
    restart: always
    depends_on:
      redis:
        condition: service_healthy
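
On a user-defined Compose network, both the service name (dev-web) and the container_name (wp1bot-web-dev) are normally resolvable by other containers on that network, so either backend URL should be reachable in principle. One way to rule out name resolution as the cause is a small check like the sketch below, run from a container attached to the wp1bot-dev network. The helper name is hypothetical and the check is not part of the PR.

```python
import socket


def can_resolve(hostname: str) -> bool:
    """Return True if the hostname resolves from where this code runs."""
    try:
        socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return False
    return True
```

Calling `can_resolve("dev-web")` and `can_resolve("wp1bot-web-dev")` from inside the worker's network would show whether the failure is in DNS or further along the request.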

@elfkuzco
Collaborator Author

elfkuzco commented Nov 24, 2025

Can I see your logs for the task worker and worker manager? My suspicion is that you are running an older image of the task worker that still uses the dnscache. After openzim/zimfarm#1515, ENVIRONMENT=development should skip starting the DNS cache, and it is the DNS cache that makes it impossible to contact the wp1 backend from the scraper. The logs are probably the only way to tell which version you are on, since the zimfarm images do not have any tags.

@audiodude
Member

Did a --pull always, re-pulled and rebuilt all the images. Still failed with the same message. Here are the logs:

[2025-11-24 04:00:50,533: INFO] API is offering 1 task(s): ['5c66a411-f646-4e19-9598-62e73acc880d']
[2025-11-24 04:00:50,533: DEBUG] start_task: 5c66a411-f646-4e19-9598-62e73acc880d@zimfarm-api:80
[2025-11-24 04:00:50,606: DEBUG] update_task_data: 5c66a411-f646-4e19-9598-62e73acc880d@zimfarm-api:80
[2025-11-24 04:00:50,642: DEBUG] start_task_worker: 5c66a411-f646-4e19-9598-62e73acc880d
[2025-11-24 04:00:50,643: DEBUG] getting image ghcr.io/openzim/zimfarm-task-worker:latest
[2025-11-24 04:00:51,353: DEBUG] running ['task-worker', '--task-id', '5c66a411-f646-4e19-9598-62e73acc880d', '--webapi-uri', 'http://zimfarm-api:80/v2']
[2025-11-24 04:02:31,563: INFO] container zimtask_5c66a exited successfuly, removing.
[2025-11-24 04:02:31,592: INFO] task 5c66a411-f646-4e19-9598-62e73acc880d@zimfarm-api:80 is not running anymore, unwatching.
[2025-11-24 04:02:31,592: DEBUG] polling http://zimfarm-api:80/v2…
[2025-11-24 04:02:31,640: INFO] API is offering 1 task(s): ['eeeff927-8130-489e-b41f-8c54000b4632']
[2025-11-24 04:02:31,640: DEBUG] start_task: eeeff927-8130-489e-b41f-8c54000b4632@zimfarm-api:80
[2025-11-24 04:02:31,681: DEBUG] update_task_data: eeeff927-8130-489e-b41f-8c54000b4632@zimfarm-api:80
[2025-11-24 04:02:31,698: DEBUG] start_task_worker: eeeff927-8130-489e-b41f-8c54000b4632
[2025-11-24 04:02:31,699: DEBUG] getting image ghcr.io/openzim/zimfarm-task-worker:latest
[2025-11-24 04:02:32,792: DEBUG] running ['task-worker', '--task-id', 'eeeff927-8130-489e-b41f-8c54000b4632', '--webapi-uri', 'http://zimfarm-api:80/v2']

@audiodude
Member

Actually, I don't think it's a problem with Zimfarm, because when I paste http://wp1bot-web-dev:5000/v1/builders/04c0f118-f9e5-449a-a8de-f05026555482/selection/latest.tsv and change the server to http://localhost:5000 I get redirected to https://localhost:9000/org-kiwix-dev-wp1/selections/wp1.selection.models.simple/188cd764-240c-425e-9cf8-f19646df10f7/ZIMTest14.tsv with an SSL error. I think the redirect to minio is what's failing.

@elfkuzco
Collaborator Author

Oh, don't use https with the minio URL.
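
The fix in this thread was a credentials change, not code. Purely as an illustration of the failure mode, a guard like the sketch below would catch an https scheme on a local storage endpoint before it ends up in a redirect. The helper name and the set of hosts treated as local are assumptions.

```python
from urllib.parse import urlsplit, urlunsplit

# Hosts assumed to be plain-HTTP development endpoints (an assumption,
# not something the WP1 configuration defines).
LOCAL_HOSTS = {"localhost", "127.0.0.1"}


def dev_storage_url(url: str) -> str:
    """Downgrade https to http for local development storage URLs."""
    parts = urlsplit(url)
    if parts.scheme == "https" and parts.hostname in LOCAL_HOSTS:
        parts = parts._replace(scheme="http")
    return urlunsplit(parts)
```

Applied to an `https://localhost:9000/...` URL like the one in the redirect above, this returns the same URL with an `http` scheme; URLs on other hosts pass through unchanged.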

@audiodude
Member

Okay I've updated my credentials and things seem to be working now, except for openzim/mwoffliner#2578 so I actually haven't seen a full e2e manual test yet (the Zimfarm has been running for 30+ minutes for my 1 article).

@audiodude
Member

> Oh, don't use https with the minio URL.

Yes I fixed that and some other problems with my credentials. This credentials.py thing is such a nightmare, I can't wait to be done with it.

@audiodude audiodude force-pushed the pass-offliner-definition-version branch from af21eae to 544ecb6 Compare November 24, 2025 05:04
@elfkuzco
Collaborator Author

> so I actually haven't seen a full e2e manual test yet (the Zimfarm has been running for 30+ minutes for my 1 article).

Could you pin to a specific version? See https://api.farm.openzim.org/v2/offliners/mwoffliner/versions. I can't remember which version I pinned to, but you could pin to 1.17.2. Also, I used a simple selection: Statue_of_Liberty. Hopefully that works until the upstream issue is resolved.

@audiodude
Member

Actually I think there was a bug in my code where I was passing a null articleList. I'm trying again, I'll let you know if I still have problems getting it to work.

@audiodude audiodude dismissed benoit74’s stale review December 1, 2025 03:18

I have tested this end to end in the development environment, and was successfully able to create a valid ZIM.

@audiodude audiodude removed the request for review from benoit74 December 1, 2025 03:20
@audiodude audiodude force-pushed the pass-offliner-definition-version branch from 34c924e to 96a75b8 Compare December 1, 2025 03:39
@audiodude audiodude merged commit e97011b into main Dec 1, 2025
4 of 5 checks passed
@audiodude audiodude deleted the pass-offliner-definition-version branch December 1, 2025 04:26
@benoit74
Contributor

benoit74 commented Dec 1, 2025

@audiodude Do you intend to deploy this soon? It contains changes for production that would enable us to merge openzim/zimfarm#1422 and finally clean up the transitory code around offliner definition versions in the zimfarm codebase.

Do not forget that you need to add the new 'definition_version': '1.17.3', key to the production credentials before deploying.

@audiodude
Member

@benoit74 This has been deployed with the updated config.

I have https://farm.zimit.kiwix.org/recipes/wp1_selection_883c769321f6dd39 which was created from WP1

@audiodude
Member

I was able to successfully download the ZIM created from WP1

@benoit74
Contributor

benoit74 commented Dec 2, 2025

Thank you!
